12 research outputs found

    Analysis of schema structures in the Linked Open Data graph based on unique subject URIs, pay-level domains, and vocabulary usage

    The Linked Open Data (LOD) graph represents a web-scale distributed knowledge graph interlinking information about entities across various domains. A core characteristic is the lack of a pre-defined schema, which allows data from all kinds of domains to be modelled flexibly. However, Linked Data does exhibit schema information in a twofold way: explicitly, by attaching RDF types to entities, and implicitly, by using domain-specific properties to describe them. In this paper, we present and apply different techniques for investigating the schematic information encoded in the LOD graph at different levels of granularity. We investigate information-theoretic properties of so-called Unique Subject URIs (USUs) and measure the correlation between the properties and types observed for USUs on a large-scale semantic graph data set. Our analysis provides insights into the information encoded in the different schema characteristics. Two major findings are that implicit schema information is far more discriminative and that applications relying on schema information based on either types or properties alone will capture only between 63.5 % and 88.1 % of the schema information contained in the data. As the level of discrimination depends on how data providers model and publish their data, we have conducted, in a second step, an investigation based on pay-level domains (PLDs) as well as the semantic level of vocabularies. Overall, we observe that most data providers combine up to 10 vocabularies to model their data and that every fifth PLD uses a highly structured schema.
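The comparison between explicit (type-based) and implicit (property-based) schema information can be illustrated with a small entropy calculation. This is a toy sketch on hypothetical data, not the paper's actual measurement pipeline; the entity names and property sets below are invented for illustration:

```python
from collections import Counter
from math import log2

# Hypothetical toy data: for each Unique Subject URI (USU), its set of
# explicit RDF types and the set of properties used to describe it.
usus = [
    ({"Person"}, {"name", "birthDate"}),
    ({"Person"}, {"name", "birthDate"}),
    ({"Place"},  {"name", "population"}),
    ({"Place"},  {"name", "elevation"}),
]

def entropy(observations):
    """Shannon entropy (in bits) of the distribution of observed values."""
    counts = Counter(observations)
    n = sum(counts.values())
    return -sum(c / n * log2(c / n) for c in counts.values())

type_sets = [frozenset(t) for t, _ in usus]   # explicit schema view
prop_sets = [frozenset(p) for _, p in usus]   # implicit schema view
joint     = list(zip(type_sets, prop_sets))   # combined schema view

h_types = entropy(type_sets)
h_props = entropy(prop_sets)
h_joint = entropy(joint)

# Fraction of the combined schema information each view captures alone.
# In this toy data the property sets are more discriminative than types.
print(f"types alone:      {h_types / h_joint:.2f}")   # 0.67
print(f"properties alone: {h_props / h_joint:.2f}")   # 1.00
```

In this sketch, the property sets distinguish all entity groups that the combined view distinguishes, while types alone capture only part of that information, mirroring the paper's finding that implicit schema information is more discriminative.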

    Modeling and Simulation of Quality of Service for Composite Web Services

    As businesses begin to link Web services to create new functionality in the form of composite Web services, known as Web processes, it will become increasingly important to have a way of measuring their quality of service (QoS). To this end, we present a method that uses a predictive QoS model to compute the QoS for Web processes in terms of performance, cost and reliability. The ability to compute QoS for a Web process enables an organization to tune the process. Tuning Web processes presents an interesting problem: during the act of tuning, a business may want to explore many different configurations of the Web process in order to answer "what-if" questions. Composing and evaluating the QoS for many different configurations may be prohibitive in terms of time and cost. We present a simulation-based technique to overcome this challenge in tuning Web processes.
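The kind of aggregation such a predictive QoS model performs can be sketched with the standard reduction rules for sequential and parallel regions of a workflow: times and costs add along a sequence, parallel branches finish when the slowest one does, and reliabilities multiply. This is a toy model under those common assumptions, not the paper's exact model; the task names and numbers are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class QoS:
    time: float         # expected execution time
    cost: float         # monetary cost
    reliability: float  # probability of successful completion

def sequence(*tasks: QoS) -> QoS:
    """Sequential composition: times and costs add, reliabilities multiply."""
    r = 1.0
    for t in tasks:
        r *= t.reliability
    return QoS(sum(t.time for t in tasks), sum(t.cost for t in tasks), r)

def parallel(*tasks: QoS) -> QoS:
    """Parallel (AND-split/AND-join): time is the slowest branch,
    costs add, and all branches must succeed."""
    r = 1.0
    for t in tasks:
        r *= t.reliability
    return QoS(max(t.time for t in tasks), sum(t.cost for t in tasks), r)

# A hypothetical Web process: book flight and hotel in parallel, then pay.
flight = QoS(time=2.0, cost=5.0, reliability=0.99)
hotel  = QoS(time=3.0, cost=4.0, reliability=0.98)
pay    = QoS(time=1.0, cost=0.5, reliability=0.999)

process = sequence(parallel(flight, hotel), pay)
print(process)  # time 4.0, cost 9.5, reliability ~ 0.969
```

Answering a "what-if" question then amounts to recomputing the reduction with a different configuration (e.g. swapping in a faster but less reliable hotel service), which is exactly where exhaustive evaluation becomes expensive and simulation pays off.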


    Database foundations for scalable RDF processing

    As more and more data is provided in RDF format, storing huge amounts of RDF data and efficiently processing queries on such data is becoming increasingly important. The first part of the lecture will introduce state-of-the-art techniques for scalably storing and querying RDF with relational systems, including alternatives for storing RDF, efficient index structures, and query optimization techniques. As centralized RDF repositories have limitations in scalability and failure tolerance, decentralized architectures have been proposed. The second part of the lecture will highlight system architectures and strategies for distributed RDF processing. We cover search engines as well as federated query processing, highlight differences to classic federated database systems, and discuss efficient techniques for distributed query processing in general and for RDF data in particular. Extracting knowledge from the Web is an excellent showcase, and potentially one of the biggest challenges, for the scalable management of uncertain data we have seen so far. The third part of the lecture is intended to provide a close-up on current approaches and platforms to make reasoning (e.g., in the form of probabilistic inference) with uncertain RDF data scalable to billions of triples.
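The index structures mentioned in the first part of the lecture can be sketched as a minimal in-memory triple store keeping two index permutations (SPO and POS), so that different triple-pattern shapes can each be answered by a direct lookup. This is an illustrative sketch only; real RDF stores maintain more permutations and far more elaborate physical layouts:

```python
from collections import defaultdict

class TripleStore:
    """Minimal in-memory RDF triple store with two index permutations."""

    def __init__(self):
        # subject -> predicate -> set of objects (answers patterns like (s, p, ?o))
        self.spo = defaultdict(lambda: defaultdict(set))
        # predicate -> object -> set of subjects (answers patterns like (?s, p, o))
        self.pos = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        """Insert one triple into both indexes."""
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)

    def objects(self, s, p):
        """All o with (s, p, o) in the store, via the SPO index."""
        return self.spo[s][p]

    def subjects(self, p, o):
        """All s with (s, p, o) in the store, via the POS index."""
        return self.pos[p][o]

store = TripleStore()
store.add("ex:alice", "rdf:type", "ex:Person")
store.add("ex:bob",   "rdf:type", "ex:Person")
store.add("ex:alice", "ex:knows", "ex:bob")

print(sorted(store.subjects("rdf:type", "ex:Person")))  # ['ex:alice', 'ex:bob']
print(sorted(store.objects("ex:alice", "ex:knows")))    # ['ex:bob']
```

Each additional permutation (OSP, PSO, etc.) trades storage for the ability to serve more pattern shapes with a single index scan, which is the basic design choice behind the relational and native RDF storage alternatives the lecture surveys.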